August 08, 2019

Proudly supported by the
people of Western Australia
through Channel 7's Telethon

R-Fundamentals

What we will cover

  • OveRview
    • What is R, brief history
  • How to get set up and use R
  • Fundamentals of R
    • Basic syntax
    • Data classes and structures [Main Section]
    • Reading data in
    • Installing/loading packages
  • Staying up-to-date in R/further reading

OveRview

What R is

R is:

  • A statistical programming, computation, and graphics system,
  • Its own language (code-based programming/commands),
  • A cross-platform system (Windows, macOS, Unix-like),
  • Freely available online,
  • A modular system

Great, so what is R?

R is an environment where you can read data in, carry out data manipulation, analyse your data, and prepare your results for others to see.

Me. (Today)

Great, so what is R?

R is an environment where you can:

  • read (pull) data in (from a variety of sources),
  • carry out (very complex) data manipulation,
  • analyse your data (via a broad and rapidly increasing suite of statistical models), and
  • prepare (reproducibly) your results (figures, tables, reports, papers) for others to see (or interact with).
Me. (Today)

A very brief history

  • R first appeared in 1993
  • Originally written by Ross Ihaka and Robert Gentleman (University of Auckland, NZ)
  • Influenced by the existing languages S and Scheme
  • Mid 1997, the ‘R Core Team’ was established to maintain the base code
  • CRAN (Comprehensive R Archive Network)
    • A collection of sites (mirrored daily around the world) that manages the distribution of R and its packages
  • R V1.0 released 2000-02-29 (stable)

R more recently

  • Over 10,000 R ‘add-on’ packages (libraries) on CRAN
    • plus Bioconductor, GitHub etc
  • RStudio initially released in 2011
    • integrated development environment
  • Annual conference
    • useR! (2018 Brisbane, 2019 Toulouse)
  • R-Ladies

R more recently

  • 200? enter Hadley Wickham
    • ggplot package
    • plyr package
  • Chief Scientist at RStudio
    • Kiwi
    • Developed many packages
    • Written many books (R for Data Science, Advanced R)
      • Freely available online

R more recently

  • Tidyverse
    • controversy
    • base vs tidyverse

“The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.”

What R isn’t?

R is not a:

  • data collection tool
  • (permanent) data storage system

R - How to get set up and use R

But first - Disclaimer

  • There are many ways to use R
  • There are many ways to do most operations in R
  • This presentation focuses on the collective ‘best practice’ advice of our team of daily R users
  • R is rapidly evolving
  • Online help/guides are often out-of-date

Interacting with R

  • R is an interactive language
    • Type a command, submit, immediate response (line-by-line)
  • Typically, commands saved in a plain text file (.R)
  • Commands can be executed in bulk
    • Large scale analysis/data manipulation
    • ‘one-click’ reports (like these slides)

And - a heads up

  • R is a cAsE sensitive language
  • R is largely forgiving with where white space is (or isn’t!)
    • including line breaks, bracket placement etc

Installing R

RStudio IDE

  • free and open-source integrated development environment for R
  • Screen divided into quarters, typically
    • One for your code (commands)
    • One for your console (commands executed/results shown)
    • One for your environment management
    • One for your plots/help

RStudio IDE

  • Customisable and quite diverse
  • Ready access to:
    • Terminal, code repository, package management, file management
  • Other benefits including
    • Code highlighting (full-featured)
    • Customisable settings (load/save session)
    • Tab completion

R - Fundamentals of R

Basic syntax

Basic syntax - assign

  • Assign ‘<-’
    • less than sign followed by dash
    • puts the right hand side into the left hand side

a <- 3

  • a will now be 3 (which we’ll see shortly)
  • You can use ‘=’ in place of ‘<-’
  • Typically we only use ‘=’ within brackets
    • It’s complicated and related to environments (scoping)

Basic syntax - operations

  • basic math
3+3
## [1] 6
3*3
## [1] 9

Basic syntax - operations

a <- 3
a + a
## [1] 6

Basic syntax - brackets

  • Round brackets (something) | (x, y)
    • multiple things inside, separated by a comma
    • typically used to group/provide arguments for a function
sum(1, 2, 3, 4, NA, 5, na.rm = T)
## [1] 15

Basic syntax - brackets

  • Square brackets [something] | [ ] [,] [[ ]]
    • multiple things inside, separated by a comma
    • typically used for indices when navigating within a structure
c("one", "two", "three")[2]
## [1] "two"

Basic syntax - brackets

  • Curly brackets {something}
    • groups multiple lines of code
    • typically used when creating functions
## [1] "matt is awesome"
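
The code behind this output is not shown above, so what follows is only a minimal sketch of curly brackets grouping the body of a function. The function name praise() and its argument are made up for illustration.

```r
# Hypothetical function: the curly brackets group the lines of its body
praise <- function(name) {
  message_text <- paste(name, "is awesome")
  message_text  # the last value evaluated is what the function returns
}

praise("matt")
## [1] "matt is awesome"
```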

Basic syntax - other

  • Colon : used to create a sequence
    • 1:4 produces 1 2 3 4
    • double colon to access functions in libraries that aren’t loaded
  • Hashtag for comments # For this next bit of code online somewhere
    • Code won’t be executed
  • Question mark ?something for help
    • Double question mark to search for commands
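
A short sketch of these pieces of syntax together. The variable names s and m are placeholders, and the help calls are commented out because they only make sense in an interactive session.

```r
s <- 1:4                         # colon builds the sequence 1 2 3 4
m <- stats::median(c(1, 5, 9))   # package::function works without library()
s
## [1] 1 2 3 4
m
## [1] 5
# ?median      # opens the help page for median()
# ??median     # searches all installed help pages for "median"
```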

Basic syntax - other

  • The letters T and F are shorthand for the logicals TRUE and FALSE
    • unlike TRUE and FALSE they are not truly reserved and can be overwritten, so don’t
    • used regularly within arguments
  • Working directory
    • getwd() and setwd()
    • best practice is using relative directory paths!
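
Why overwriting T matters can be sketched as follows. T and F are ordinary variables that start out holding TRUE and FALSE, so R lets you reassign them. The example runs inside local() so the global T is left alone, and the name demo is illustrative only.

```r
demo <- local({
  T <- 0                      # legal, because T is only a variable
  c(short = isTRUE(T),        # FALSE: T no longer means TRUE here
    full  = isTRUE(TRUE))     # TRUE: the full word cannot be reassigned
})
demo
```

Writing TRUE and FALSE in full in saved scripts avoids this trap altogether.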

Classes

Classes

  • Each data point (value within a cell) will have a class.
  • There are many classes, most (and the common ones) are nice and clear.
  • Classes have something of a hierarchy
    • character is King/Queen amongst many
  • Classes are closely related to Structures (the next topic).
  • Functions will respond differently based on the class of data you feed them.
  • Sometimes referred to as mode
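
The hierarchy can be seen by combining values of different classes with c(): R converts everything to the "highest" class present. A small sketch (the name mixed is a placeholder):

```r
class(c(TRUE, 2L))     # logical with integer
## [1] "integer"
class(c(2L, 3.5))      # integer with a decimal number
## [1] "numeric"
class(c(3.5, "a"))     # anything with character
## [1] "character"
mixed <- c(TRUE, 2, "three")
mixed
## [1] "TRUE"  "2"     "three"
```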

Classes - character

  • character is the highest-level ‘catch-all’ class
  • Free text fields will be of class character
  • Any* data point that has something other than numbers in it will default to being of class character

*data points with non-numeric (alpha) characters can be made into other classes (more to come)

Classes - character

x <- "Teach me R"
x
## [1] "Teach me R"
class(x)
## [1] "character"
  • Note the quote marks

Classes - character

x <- Teach me R
## Error: <text>:1:12: unexpected symbol
## 1: x <- Teach me
##                ^
  • Note the lack of quote marks

Classes - character

x <- "Teach me R"
y <- "Teach me now"
x + y
## Error in x + y: non-numeric argument to binary operator

Classes - character

x <- "3"
class(x)
## [1] "character"
  • Interesting?

Classes - numeric (integer/double)

  • There are three classes that can all be thought of as “numbers”
    • numeric
    • double (double precision floating point numbers - computer format)
    • integer (no decimals)
  • R may convert between the three as it sees fit

  • “it is perfectly feasible to use R successfully for years and not need to know the answer to this question” - stackoverflow
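
For the curious, a minimal sketch of how the three relate. class() reports what R treats the value as, while typeof() reports how it is stored.

```r
x <- 3      # a plain number is stored as a double
y <- 3L     # the L suffix asks for an integer
class(x); typeof(x)
## [1] "numeric"
## [1] "double"
class(y); typeof(y)
## [1] "integer"
## [1] "integer"
class(x + y)   # mixing the two gives numeric
## [1] "numeric"
```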

Classes - numeric (integer/double)

x <- "3"
x
## [1] "3"
class(x)
## [1] "character"
  • Oh, quotes!

Classes - numeric (integer/double)

x <- 3
x
## [1] 3
class(x)
## [1] "numeric"
x + 2
## [1] 5

Classes - numeric (integer/double)

x <- as.integer(3.1)
x
## [1] 3
class(x)
## [1] "integer"
  • Notice the decimal has dropped off?

Classes - numeric (integer/double)

x + 0.4
## [1] 3.4
class(x + 0.4)
## [1] "numeric"
  • Notice the decimal is back? The result is numeric again

Classes - factor

  • Class factor is used when there is a limited set of responses
    • e.g. likert scales, States, income brackets
  • Generally have labels, associated with levels (the responses)
  • Always have an underlying number linked to each label/level
  • Can be ordered, but don’t have to be
    • Default behaviour is to handle alphabetically
  • Useful for creating tables, coefficients in models etc

Classes - factor

y <- c("Cat", "Other", "Dog", "Dog"); y
## [1] "Cat"   "Other" "Dog"   "Dog"
class(y)
## [1] "character"
y <- as.factor(c("Cat", "Other", "Dog", "Dog")); y
## [1] Cat   Other Dog   Dog  
## Levels: Cat Dog Other
  • note, the order of the levels

Classes - factor

class(y); levels(y)
## [1] "factor"
## [1] "Cat"   "Dog"   "Other"

Classes - factor

y <- y[-1]  # get rid of the cat
table(y)
## y
##   Cat   Dog Other 
##     0     2     1
y <- as.factor(y); levels(y)
## [1] "Cat"   "Dog"   "Other"
y <- factor(y); levels(y)
## [1] "Dog"   "Other"

Classes - factor

  • review ?factor
  • as.factor() does not accept additional arguments
    • see levels defaults.
  • relevel lets you set the ‘baseline’ level
    • very handy for modeling
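
A sketch of controlling level order, reusing the pet values from the earlier slide. The object names pets, pets2 and pets3 are placeholders.

```r
pets <- factor(c("Cat", "Other", "Dog", "Dog"))
levels(pets)                       # alphabetical by default
## [1] "Cat"   "Dog"   "Other"
as.integer(pets)                   # the underlying numbers
## [1] 1 3 2 2
pets2 <- relevel(pets, ref = "Dog")   # make "Dog" the baseline
levels(pets2)
## [1] "Dog"   "Cat"   "Other"
pets3 <- factor(pets, levels = c("Other", "Dog", "Cat"))  # fully custom order
levels(pets3)
## [1] "Other" "Dog"   "Cat"
```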

Classes - factor

class(tki_demo$intervention)
## [1] "factor"
table(tki_demo$intervention)
## 
## Placebo  Drug 1  Drug 2 
##      29      38      33
levels(tki_demo$intervention)      
## [1] "Placebo" "Drug 1"  "Drug 2"
levels(factor(as.character(tki_demo$intervention)))
## [1] "Drug 1"  "Drug 2"  "Placebo"

Classes - factor

y <- tki_demo$intervention[1:5]
y           # retained class and levels
## [1] Drug 2  Drug 2  Drug 2  Placebo Drug 1 
## Levels: Placebo Drug 1 Drug 2
y <- y[-5]  # get rid of the 'Drug 1'
table(y)
## y
## Placebo  Drug 1  Drug 2 
##       1       0       3
y <- as.factor(y); levels(y)
## [1] "Placebo" "Drug 1"  "Drug 2"
y <- factor(y); levels(y)
## [1] "Placebo" "Drug 2"

Classes - logical

  • Class logical variables can only take two values*, TRUE or FALSE (*plus NA for missing)
  • Can usefully be abbreviated as T and F
  • Generally used as a selection mechanism
    • records in/out
    • conditional (if) statements
  • Work well with sum(), any(), all()
  • Be aware of combined logicals!
    • F + F + NA = ?
    • all(c(T, NA, T)) = ?
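
A sketch of a logical vector used as a selection mechanism. The score values are made up for illustration.

```r
score <- c(3, 8, NA, 6, 1)
keep <- score > 5            # a logical vector; the NA stays NA
keep
## [1] FALSE  TRUE    NA  TRUE FALSE
score[which(keep)]           # which() keeps only the TRUE positions
## [1] 8 6
sum(keep, na.rm = TRUE)      # how many records were selected
## [1] 2
```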

Classes - logical

y <- c(TRUE, FALSE, TRUE, FALSE, FALSE)
class(y)
## [1] "logical"
table(y)
## y
## FALSE  TRUE 
##     3     2

Classes - logical

F + F + NA
## [1] NA
T + NA + T
## [1] NA
F + F + F
## [1] 0
T + T + T
## [1] 3

Oh!

Classes - logical

table(y)
## y
## FALSE  TRUE 
##     3     2
sum(y)
## [1] 2

Classes - logical

all(c(T, NA, T))
## [1] NA
all(c(F, NA, F))
## [1] FALSE
any(c(T, NA, T))
## [1] TRUE
any(c(F, NA, F))
## [1] NA
all(c(T, F, T))
## [1] FALSE

Classes - date

  • Class date follows relatively strict formatting
  • Can also include time
    • via ‘associated’ classes POSIXct, POSIXlt
    • be aware of timezones
  • Relatively nice syntax for converting strings to dates
  • Has many helper functions to aid date math
  • Lubridate!
  • Go slow!

Classes - date

  • ?strptime can be your friend
    • Date-time Conversion Functions to and from Character

%d Day of the month as decimal number (01–31).

%e Day of the month as decimal number (1–31), …

%H Hours as decimal number (00–23). …

%y Year without century (00–99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19…

%Y Year with century. Note that whereas there was no zero in the original Gregorian calendar…

Classes - date

  • Note the structure
Sys.Date()
## [1] "2019-07-26"
class(as.Date("2019-01-31"))
## [1] "Date"
class(as.Date("31-01-2019"))
## [1] "Date"
class(as.Date("31.01.2019"))
## Error in charToDate(x): character string is not in a standard unambiguous format

Classes - date

class(as.Date("31.01.2019", format = "%d.%m.%Y"))
## [1] "Date"
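
A sketch of the same format codes at work in both directions. Note the last line: a string that parses without a format is not necessarily parsed as you intended, which is one reason to go slow.

```r
d <- as.Date("31.01.2019", format = "%d.%m.%Y")
format(d)                     # dates print as year-month-day by default
## [1] "2019-01-31"
format(d, "%d/%m/%y")         # the same codes control the output
## [1] "31/01/19"
d + 30                        # date math works in days
## [1] "2019-03-02"
as.Date("31-01-2019") == d    # accepted without a format, but a different date
## [1] FALSE
```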

Classes - date

  • library(lubridate)
library(lubridate); class(dmy("31.01.2019"))
## [1] "Date"
  • complex (powerful?) new classes
    • duration, interval, period
round(interval(ymd("1983-10-16"), Sys.Date()) / years(1), 2)
## [1] 35.78
as.duration(interval(ymd("1983-10-16"), Sys.Date()))
## [1] "1128988800s (~35.78 years)"
as.period(interval(ymd("1983-10-16"), Sys.Date()))
## [1] "35y 9m 10d 0H 0M 0S"

Data structures

Data structures

  • Structure is our term, it is not an R term
  • Sometimes referred to as type
  • Some structures are a class within R
  • But not all

Single element (scalar typically) - most basic building block

i <- 1

Structures - vector

  • typically two or more elements (strictly, R treats a single value as a vector of length 1)
  • a vector is not a class
  • a vector HAS a class, and it can have only one! (remember the hierarchy)
  • the power of R, is vectorised processing
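
A sketch of vectorised processing: the calculation is written once and applied to every element, with no loop. The heights and weights are made-up values.

```r
heights_cm <- c(150, 162, 171)
weights_kg <- c(50, 60, 70)
bmi <- weights_kg / (heights_cm / 100)^2   # element by element
round(bmi, 1)
## [1] 22.2 22.9 23.9
```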

Structures - vector

x <- c(3,4,5,6,8) # c stands for combine/concatenate
x
## [1] 3 4 5 6 8
x + 2
## [1]  5  6  7  8 10
length(x)
## [1] 5
sum(x)
## [1] 26
y <- c("Cat", "Dog", "Other", "Dog")
length(y)
## [1] 4
sum(y)
## Error in sum(y): invalid 'type' (character) of argument

Structures - data.frame

  • Likely the main* structure you will work with in R
  • Best ‘pictured’ as a table (two-dimensional), with rows and columns
  • Best ‘thought of’ as:
    • a number of | vertical | vectors, side-by-side, bound together, all of equal length
  • All elements of a column will be of the same class!
  • When reading in a csv, a plain text file, (a worksheet of) an excel file
    • it will generally become available to you as a data.frame
  • standby for tibbles

Structures - tibble

  • A tibble is functionally very similar to a data.frame
    • Part of the tidyverse
    • Has a “handy” reduced display format

We’ll come back to this

Demonstration data set(s)

class(tki_demo)
## [1] "data.frame"
dim(tki_demo)
## [1] 100   8

Demonstration data set(s)

head(tki_demo) # can you guess what tail does?
##   id        dob  male smoker intervention      day1      day2     day3
## 1  1 2004-12-08  TRUE  FALSE       Drug 2  3.787324 19.379647 29.63681
## 2  2 2007-06-14 FALSE   TRUE       Drug 2  1.200292 28.770240       NA
## 3  3 2003-01-05  TRUE  FALSE       Drug 2  6.321257 22.820082 39.02544
## 4  4 2002-09-14 FALSE  FALSE      Placebo -1.302337  4.610366  9.40575
## 5  5 2003-10-24 FALSE  FALSE       Drug 1  7.793055 19.879646 14.73297
## 6  6 2009-03-06  TRUE  FALSE       Drug 1 11.310929 12.648613       NA
names(tki_demo)
## [1] "id"           "dob"          "male"         "smoker"      
## [5] "intervention" "day1"         "day2"         "day3"

Structures - data.frame

str(tki_demo)
## 'data.frame':    100 obs. of  8 variables:
##  $ id          : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ dob         : Date, format: "2004-12-08" "2007-06-14" ...
##  $ male        : logi  TRUE FALSE TRUE FALSE FALSE TRUE ...
##  $ smoker      : logi  FALSE TRUE FALSE FALSE FALSE FALSE ...
##  $ intervention: Factor w/ 3 levels "Placebo","Drug 1",..: 3 3 3 1 2 2 3 1 1 1 ...
##  $ day1        : num  3.79 1.2 6.32 -1.3 7.79 ...
##  $ day2        : num  19.38 28.77 22.82 4.61 19.88 ...
##  $ day3        : num  29.64 NA 39.03 9.41 14.73 ...

Structures - data.frame

summary(tki_demo[, 1:4])
##        id              dob                male           smoker       
##  Min.   :  1.00   Min.   :1899-02-25   Mode :logical   Mode :logical  
##  1st Qu.: 25.75   1st Qu.:2003-06-06   FALSE:65        FALSE:72       
##  Median : 50.50   Median :2005-07-12   TRUE :35        TRUE :28       
##  Mean   : 50.50   Mean   :2004-05-07                                  
##  3rd Qu.: 75.25   3rd Qu.:2007-04-02                                  
##  Max.   :100.00   Max.   :2009-04-13

Structures - data.frame

head(tki_demo)
##   id        dob  male smoker intervention      day1      day2     day3
## 1  1 2004-12-08  TRUE  FALSE       Drug 2  3.787324 19.379647 29.63681
## 2  2 2007-06-14 FALSE   TRUE       Drug 2  1.200292 28.770240       NA
## 3  3 2003-01-05  TRUE  FALSE       Drug 2  6.321257 22.820082 39.02544
## 4  4 2002-09-14 FALSE  FALSE      Placebo -1.302337  4.610366  9.40575
## 5  5 2003-10-24 FALSE  FALSE       Drug 1  7.793055 19.879646 14.73297
## 6  6 2009-03-06  TRUE  FALSE       Drug 1 11.310929 12.648613       NA
tki_demo[1:2 , 1:3]
##   id        dob  male
## 1  1 2004-12-08  TRUE
## 2  2 2007-06-14 FALSE
tki_demo[1 , 1]
## [1] 1

Structures - data.frame

tki_demo$day1[1:10]
##  [1]  3.787324  1.200292  6.321257 -1.302337  7.793055 11.310929  8.389122
##  [8]  9.046324  6.428155  3.388006
tki_demo$index <- 1:10
tki_demo$index <- 1:nrow(tki_demo)
head(tki_demo)
##   id        dob  male smoker intervention      day1      day2     day3
## 1  1 2004-12-08  TRUE  FALSE       Drug 2  3.787324 19.379647 29.63681
## 2  2 2007-06-14 FALSE   TRUE       Drug 2  1.200292 28.770240       NA
## 3  3 2003-01-05  TRUE  FALSE       Drug 2  6.321257 22.820082 39.02544
## 4  4 2002-09-14 FALSE  FALSE      Placebo -1.302337  4.610366  9.40575
## 5  5 2003-10-24 FALSE  FALSE       Drug 1  7.793055 19.879646 14.73297
## 6  6 2009-03-06  TRUE  FALSE       Drug 1 11.310929 12.648613       NA
##   index
## 1     1
## 2     2
## 3     3
## 4     4
## 5     5
## 6     6

Structures - tibble

  • a tibble is the data.frame of the tidyverse
  • broadly the same as a data.frame, but less unruly (print)
load("../data/dat.RData"); head(tki_demo, 3)
## # A tibble: 3 x 8
##      id dob        male  smoker intervention  day1  day2  day3
##   <int> <date>     <lgl> <lgl>  <fct>        <dbl> <dbl> <dbl>
## 1     1 2004-12-08 TRUE  FALSE  Drug 2        3.79  19.4  29.6
## 2     2 2007-06-14 FALSE TRUE   Drug 2        1.20  28.8  NA  
## 3     3 2003-01-05 TRUE  FALSE  Drug 2        6.32  22.8  39.0

Structures - tibble

head(tki_demo, 2)
## # A tibble: 2 x 8
##      id dob        male  smoker intervention  day1  day2  day3
##   <int> <date>     <lgl> <lgl>  <fct>        <dbl> <dbl> <dbl>
## 1     1 2004-12-08 TRUE  FALSE  Drug 2        3.79  19.4  29.6
## 2     2 2007-06-14 FALSE TRUE   Drug 2        1.20  28.8  NA
  • Given the dimensions by default
  • Given the column classes by default
  • Control printing (won’t blow out your console)

Structures - list

  • Best ‘pictured’ as a 3D (deep) (filing cabinet like) structure
    • A vector where each element can have a different class/dimension
  • Lists can be used to keep related things together rather than as separate objects
    • If you had a data.frame of data for each calendar year
    • You can apply operations to all calendar years at the same time
  • Lists can be good for collecting output (complex elements) from a loop
    • Model output from running a model on multiple imputed datasets
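
The calendar-year idea can be sketched like this. The visits data and its column names are made up for illustration.

```r
visits <- data.frame(year  = c(2017, 2017, 2018, 2018, 2018),
                     score = c(4, 6, 5, 7, 9))
by_year <- split(visits, visits$year)   # a list holding one data.frame per year
class(by_year)
## [1] "list"
names(by_year)
## [1] "2017" "2018"
sapply(by_year, function(d) mean(d$score))   # same operation on every year
## 2017 2018 
##    5    7 
```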

Structures - list

tki_list <- list(tki_demo,
                 tki_demo_complications)
class(tki_list)
## [1] "list"
head(tki_list[[1]], 2) # access the first element 
## # A tibble: 2 x 8
##      id dob        male  smoker intervention  day1  day2  day3
##   <int> <date>     <lgl> <lgl>  <fct>        <dbl> <dbl> <dbl>
## 1     1 2004-12-08 TRUE  FALSE  Drug 2        3.79  19.4  29.6
## 2     2 2007-06-14 FALSE TRUE   Drug 2        1.20  28.8  NA
head(tki_list[[2]], 2) # access the second element 
## # A tibble: 2 x 2
##      id complications
##   <int> <chr>        
## 1    12 Yoda speach  
## 2    22 Man flu

Structures - list

  • why? efficiency!
lapply(tki_list, head, 2)
## [[1]]
## # A tibble: 2 x 8
##      id dob        male  smoker intervention  day1  day2  day3
##   <int> <date>     <lgl> <lgl>  <fct>        <dbl> <dbl> <dbl>
## 1     1 2004-12-08 TRUE  FALSE  Drug 2        3.79  19.4  29.6
## 2     2 2007-06-14 FALSE TRUE   Drug 2        1.20  28.8  NA  
## 
## [[2]]
## # A tibble: 2 x 2
##      id complications
##   <int> <chr>        
## 1    12 Yoda speach  
## 2    22 Man flu

Structures - other

Matrix (2-dimensions)

  • A grid like structure (similar to data.frames)
  • Entire structure is one class (all numeric or character etc)
  • Cannot use $ for referencing columns ([ , ] used)

Array (n-dimensions)

  • Like a matrix with more than two dimensions, and must be all of the same class

beyond the scope of this level - computational efficiency
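
For reference only, a minimal sketch of both structures. The object names m, m_chr and a are placeholders.

```r
m <- matrix(1:6, nrow = 2)     # filled column by column
dim(m)
## [1] 2 3
m[2, 3]                        # row 2, column 3
## [1] 6
m_chr <- cbind(m, "a")         # one character column converts the whole matrix
class(m_chr[1, 1])
## [1] "character"
a <- array(1:24, dim = c(2, 3, 4))   # three dimensions
a[2, 3, 4]
## [1] 24
```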

Reading data in

Reading data in

  • The function you use depends on your file type
  • Flat files (plain text files) (.csv, .txt) can be read in with base functions
    • read.csv() and read.table() are commonly used
  • Other files require specific packages (next topic)
    • The readxl package reads in (.xls, .xlsx) Excel files via read_excel()
    • The foreign and haven packages read in SPSS, Stata, and SAS files

Reading data in - arguments

  • header = T
    • Is the top row of your dataset column headers?
  • na.strings = c("NA", "missing", "999")
    • This will mean cells end up as true R based NA values, which is very handy
  • stringsAsFactors = F
    • Can cause issues if not explicitly set as F (ID codes, free text responses)

Reading data in - example

dat <- read.csv("/DIRECTORY/my_data_file.csv", header = T, na.strings = c("NOT CONTACTED"))

or

dat <- read.csv("../data/my_data_file.csv", header = T, na.strings = c("NOT CONTACTED"))

  • This will load the csv file into a data.frame called dat
  • Ensure path is valid (shared drives)

Reading data in - considerations

  • Excel
    • skip lines? data classes (dates?)?
    • which worksheet(s?) did you want?
  • SPSS
    • do you want category labels? or underlying stored values (think factors)?

Reading data in - considerations

  • A single character in a column that isn’t 0-9 or . will mean that column is read in as character
  • You can load data directly from the web

Installing/loading packages

Packages - overview

  • Packages can be thought of as ‘bundles of functions’
    • They can contain datasets and other things too
  • They extend the functionality of R
    • New graphing abilities, modeling methods, faster data manipulation
  • It’s very common for scripts to start by calling many packages
  • Require semi-regular updating (easy)

Packages - installing and loading

  • Installed via: install.packages("ggplot2")
    • note the quote marks
    • by default, this will also install that package’s dependencies
  • Packages loaded for use via: library(ggplot2)
    • note the lack of quote marks
  • Updated via: update.packages()
    • consider the ‘ask’ argument (ask = T, the default, prompts before each update)
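
One common pattern, sketched here as a suggestion rather than a rule, is to install a package only when it is missing and then load it. The package name "stats" is a stand-in that ships with R so the sketch runs anywhere; swap in the package you need.

```r
pkg <- "stats"   # stand-in package name for illustration

if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)   # only reached when the package is not installed
}
library(pkg, character.only = TRUE)   # character.only lets library() take a string
```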

Packages - installing and loading

  • Some packages are installed direct from the developer
    • Installed via devtools::install_github()
    • That is, using the install_github() function within the devtools package
  • Our package is installed this way
    • devtools::install_github("TelethonKids/biometrics", build_vignettes = TRUE)

Packages - citing and more

  • Running .libPaths() will show where (on your computer) your packages are stored
  • Running citation("ggplot2") will tell you how to cite the package
  • Beware of clashes
    • packages overriding base functions, or two packages with the same function
    • not overly common, and you will get warned
  • Help file will tell you what package a function resides in
    • ?geom_path starts (top left) with geom_path {ggplot2}

Saving within R

  • Save all scripts (.R, .Rmd) so you can always regenerate your results from your original data files
  • Saving your workspace is commonly done
    • This will include everything within your current session (/environment)
    • All datasets, model output, custom functions, settings changes - but not loaded libraries
  • Saved via: save.image("my_session.Rdata")
  • Loaded via: load("my_session.Rdata")

Saving within R

  • Save specific (individual) structures/objects on their own (not commonly done)
    • saved via: saveRDS(dat, "my_data_frame.RDS")
    • loaded via: readRDS("my_data_frame.RDS")
  • Exporting data via write commands
    • export via: write.csv(dat, "my_csv_export.csv")
    • commonly used arguments include row.names = F and na = ""
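
A sketch of both routes, writing to temporary files so nothing is left behind. The data.frame and the path variables are made up for illustration.

```r
dat <- data.frame(id = 1:3, score = c(2.5, NA, 4))

rds_path <- tempfile(fileext = ".RDS")
saveRDS(dat, rds_path)
identical(readRDS(rds_path), dat)   # classes and NA survive the round trip
## [1] TRUE

csv_path <- tempfile(fileext = ".csv")
write.csv(dat, csv_path, row.names = FALSE, na = "")
readLines(csv_path)   # plain text: a header line, then one line per row
```

With na = "" the missing score is written as an empty cell rather than the text NA.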

Staying up-to-date in R

Further reading

Fin

Important Syntax recap

Many of these have been seen during the prior examples:

  • <- assign, puts the right hand side into the left hand side
  • ? followed by a command, to search for help on a command
  • a:b used to generate a series of integers, from a to b
  • function( , , …) arguments to a function are separated by commas
  • c() concatenate, used to create a vector
  • [ ] [,] [[ ]] used to ‘extract’/interact with components of a structure